Add a mode to the renderer to substitute *anything* outside of the ASCII space #402
estebank wants to merge 1 commit into rust-lang:main
…CII space

When encountering some non-printable Unicode characters, we replace them with printable representations of them. This replacement still relies on Unicode support on the user's terminal. In order to enable use on... less modern systems, added a `force_ascii` mode to the renderer that replaces anything outside of ASCII with a replacement string:

```
error: oops
 --> <current file>:2:8
  |
2 | Second oops <SOH> line
  |        ^^^^ oops
```

This change unearthed a latent bug where the rendered snippet gets out of sync with the `Annotation` lo and hi char positions, causing spans that had characters replaced with a different number of bytes to be improperly highlighted.
Generally, I find it helpful to discuss things first in issues so conversations don't get fragmented and can be found where people expect them. I just noticed that our contrib guide doesn't specifically request that.
Could you provide more context on the end-user concerns that this is addressing? What types of terminals are we talking about? Why is this coming up now?
My apologies for dumping the PR over the wall. I was looking at a separate feature (a text-only output that could be suitable for Text-To-Speech users) and bumped into the FIXMEs I'd left a few years back about ASCII support.

The rationale for rustc's output remaining ASCII-only for so long was wanting to support the broadest possible userbase, regardless of the terminal they were running. In the past 10 years, even though support for "advanced" features in terminals has improved, it is still all over the place, as you well know. The thinking behind not enforcing that all of rustc's output is pure ASCII was that someone writing or reading code that uses UTF-8 in a terminal that doesn't support it will already expect the compiler's output to be garbled.

But as we move towards making the output configurable, so that rustc starts using box drawing characters, and maybe more than 16 colors, and introducing clickable URLs, the "default" setup can get fancier. That gives me an impetus to assume that enforcing ASCII output for people who explicitly select the ASCII output makes sense: they are already in an environment where some features will not work, and we are already trying to make our tooling usable for them, so it follows that we could make the toolchain work as well as it can in those environments.

Having said all of that, if we don't want to support an application that wants to support someone running it on a DEC, that's defensible and we can close this PR outright. I wrote it purely because I was already looking at the code and wanted to put it out to a branch while it was fresh in my mind, not necessarily with the expectation that it would be merged, but rather as an opening to this very conversation. I think this is mostly relevant to what I would classify as retro-computing, and that is definitionally niche.
When encountering some non-printable Unicode characters, we replace them with printable Unicode representations of them. This replacement still relies on Unicode support on the user's terminal. In order to enable use on... less modern systems, added a `force_ascii` mode to the renderer that replaces anything outside of ASCII with a replacement string, instead of the current Unicode representation.

This change unearthed a latent bug where the rendered snippet gets out of sync with the `Annotation` lo and hi char positions, causing spans that had characters replaced with a different number of bytes to be improperly highlighted, as noted in the `svg` test.
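
For readers following along, here is a minimal sketch of the idea and of the span bookkeeping it requires. It is not the code in this PR: `ascii_replacement` and `render_ascii` are hypothetical names, the `<U+XXXX>` format for non-ASCII characters is an assumption, and only `<SOH>` is taken from the example output. The point it illustrates is that once a character is replaced by a string of a different length, the annotation's lo and hi positions have to be translated into offsets in the rendered line.

```rust
use std::ops::Range;

/// Hypothetical helper: an ASCII-only stand-in for one character.
/// Printable ASCII passes through unchanged; everything else is replaced.
fn ascii_replacement(c: char) -> String {
    match c {
        ' '..='~' => c.to_string(),
        // Taken from the example output in this PR.
        '\u{01}' => "<SOH>".to_string(),
        // Assumed format for anything else; the real renderer may differ.
        _ => format!("<U+{:04X}>", c as u32),
    }
}

/// Renders one source line as pure ASCII and translates `span`
/// (char positions in the original line) into offsets in the output.
/// The output is all ASCII, so byte offsets and columns coincide.
fn render_ascii(line: &str, span: Range<usize>) -> (String, Range<usize>) {
    let mut out = String::new();
    let (mut start, mut end) = (0, 0);
    for (i, c) in line.chars().enumerate() {
        if i == span.start {
            start = out.len();
        }
        if i == span.end {
            end = out.len();
        }
        out.push_str(&ascii_replacement(c));
    }
    // Spans that start or end at (or past) the end of the line.
    let n = line.chars().count();
    if span.start >= n {
        start = out.len();
    }
    if span.end >= n {
        end = out.len();
    }
    (out, start..end)
}
```

With the input `"a\u{01}b oops"` and the char span `4..8`, the rendered line is `a<SOH>b oops` and the translated span is `8..12`. Reusing the original `4..8` against the rendered line would underline `H>b ` instead of `oops`, which is the kind of misplaced highlight described above.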